On Finding Large Conjunctive Clusters
Abstract
We propose a new formulation of the clustering problem that differs from previous work in several aspects. First, the goal is to explicitly output a collection of simple and meaningful conjunctive descriptions of the clusters. Second, the clusters might overlap, i.e., a point can belong to multiple clusters. Third, the clusters might not cover all points, i.e., not every point is clustered. Finally, we allow a point to be assigned to a conjunctive cluster description even if it does not completely satisfy all of the attributes, but rather only satisfies most. A convenient way to view our clustering problem is that of finding a collection of large bicliques in a bipartite graph. Identifying one largest conjunctive cluster is equivalent to finding a maximum edge biclique. Since this problem is NP-hard [28] and there is evidence that it is difficult to approximate [12], we solve a relaxed version where the objective is to find a large subgraph that is close to being a biclique. We give a randomized algorithm that finds a relaxed biclique with almost as many edges as the maximum biclique. We then extend this algorithm to identify a good collection of large relaxed bicliques. A key property of these algorithms is that their running time is independent of the number of data points and linear in the number of attributes.
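To make the biclique view concrete, here is a minimal sketch of the definitions only, not the randomized algorithm described in the abstract. It assumes the data is stored as a mapping from each point to the set of attributes it satisfies, and it models "satisfies most" with a hypothetical tolerance `eps`: every point in the cluster must satisfy at least a (1 - eps) fraction of the cluster's attributes. The function names, the `eps` parameter, and the toy data are illustrative assumptions and do not come from the paper.

```python
from typing import Dict, Set

# Bipartite graph: each point maps to the set of attributes it satisfies.
Graph = Dict[str, Set[str]]


def edge_count(graph: Graph, points: Set[str], attrs: Set[str]) -> int:
    """Number of (point, attribute) edges inside the candidate cluster."""
    return sum(len(graph[p] & attrs) for p in points)


def is_biclique(graph: Graph, points: Set[str], attrs: Set[str]) -> bool:
    """True if every point satisfies every attribute of the conjunction."""
    return all(attrs <= graph[p] for p in points)


def is_relaxed_biclique(graph: Graph, points: Set[str], attrs: Set[str],
                        eps: float) -> bool:
    """True if every point satisfies at least (1 - eps) of the attributes.

    This is one assumed way to formalize "satisfies most"; the paper's
    exact relaxation may differ.
    """
    needed = (1.0 - eps) * len(attrs)
    return all(len(graph[p] & attrs) >= needed for p in points)


# Toy data (hypothetical): four points, four attributes.
graph: Graph = {
    "p1": {"a", "b", "c"},
    "p2": {"a", "b", "c"},
    "p3": {"a", "b"},
    "p4": {"d"},
}
attrs = {"a", "b", "c"}

strict = is_biclique(graph, {"p1", "p2", "p3"}, attrs)
relaxed = is_relaxed_biclique(graph, {"p1", "p2", "p3"}, attrs, eps=0.4)
edges_relaxed = edge_count(graph, {"p1", "p2", "p3"}, attrs)
edges_strict = edge_count(graph, {"p1", "p2"}, attrs)
```

In this toy example, p3 is missing attribute c, so the three points do not form a biclique with {a, b, c}, but they do form a relaxed one under the assumed tolerance. The relaxed cluster covers more edges than the exact biclique on p1 and p2 alone, which is the trade-off the relaxed objective accepts.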
Similar resources
Faster Full Text Search through Document Clustering Diploma Thesis
Fast and easy access to information has become a keystone in our fast-paced world. Full text search remains an important technique in the area of information retrieval and excels wherever fast access to large amounts of text is a prime concern. Improving full text search is still an active research area. Faster full text search leads to higher throughput, reduced hardware costs and an overall i...
The Feasibility of the Conjunctive Use of Surface and Groundwater Resources in Dehloran Plain by Using the MODFLOW Model
Without proper conjunctive use of surface and groundwater resources, one of the two comes under severe water stress. Conjunctive use of surface and groundwater, especially in arid and semi-arid regions, is a scientific and practical solution for sustainable water resources management. The aim of this research was to develop a mathematical model to apply the conjunctive use of su...
Simulating and Optimizing the Conjunctive Use of Surface and Groundwater Resources Using the System Dynamics Approach (A Case Study: Dashte-Abbas Irrigation Network)
In recent years, the construction of the irrigation network and the transfer of water from Karkheh Dam to Dashte-Abbas, carried out without regard for groundwater resources, have raised the groundwater level and caused waterlogging of agricultural land. The aim of this study was, therefore, to optimize the conjunctive use of surface and groundwater resources in Dashte-Abbas to minimize waterlogging problems and a...
Solving Satisfiability Problem by Computing Stable Sets of Points in Clusters
Earlier we introduced the notion of a stable set of points (SSP) and showed that a CNF formula is unsatisfiable iff there is a set of points (i.e. complete assignments) that is stable. Experiments showed that SSPs of CNF formulas of practical interest are very big. So computing an SSP of a CNF formula point by point is, in general, infeasible. In this report, we show how an SSP can be computed ...
A New Approach in Strategy Formulation using Clustering Algorithm: An Instance in a Service Company
An increasingly dynamic and competitive environment has made strategic decision making in giant organizations more complex. Strategy formulation is one of the basic processes in achieving long-range goals. With ordinary methods, considering all factors and their significance in accomplishing individual goals is almost impossible. Here, a new approach based on clustering method is pro...